Breath Fingerprint of Colorectal Cancer Patients Based on the Gas Chromatography–Mass Spectrometry Analysis

The human body emits a multitude of volatile organic compounds (VOCs) via tissues and various bodily fluids or exhaled breath. These compounds collectively create a distinctive chemical profile, which can potentially be employed to identify changes in human metabolism associated with colorectal cancer (CRC) and, consequently, facilitate the diagnosis of this disease. The main goal of this study was to investigate and characterize the VOCs’ chemical patterns associated with the breath of CRC patients and controls and identify potential expiratory markers of this disease. For this purpose, gas chromatography–mass spectrometry was applied. Collectively, 1656 distinct compounds were identified in the breath samples provided by 152 subjects. Twenty-two statistically significant VOCs (p-xylene; hexanal; 2-methyl-1,3-dioxolane; 2,2,4-trimethyl-1,3-pentanediol diisobutyrate; hexadecane; nonane; ethylbenzene; cyclohexanone; diethyl phthalate; 6-methyl-5-hepten-2-one; tetrahydro-2H-pyran-2-one; 2-butanone; benzaldehyde; dodecanal; benzothiazole; tetradecane; 1-dodecanol; 1-benzene; 3-methylcyclopentyl acetate; 1-nonene; toluene) were observed at higher concentrations in the exhaled breath of the CRC group. The elevated levels of these VOCs in CRC patients’ breath suggest the potential for these compounds to serve as biomarkers for CRC.


Introduction
In 2020, colorectal cancer (CRC) was the third most commonly diagnosed cancer globally and the second leading cause of cancer-related mortality [1]. In that year, estimates indicated over 1.9 million new cases of CRC and approximately 935,000 related deaths [1]. The 5-year survival rate for CRC can reach 90% when the disease is detected in its early stages. Early detection is therefore crucial for improving patient outcomes and reducing mortality, which makes the development of a dependable and efficient screening tool a priority for any national health system.
CRC screening modalities exist and are recommended for routine clinical application in most developed countries, but there is still definite room for improvement. Presently, two non-invasive stool-based tests that detect occult blood are used: the guaiac-based fecal occult blood test (gFOBT) and the fecal immunochemical test (FIT). FIT is more sensitive than gFOBT in detecting precancerous lesions and cancer, leading to a strong recommendation for prioritizing FIT over gFOBT [2–5]. Although FIT is currently the non-invasive test of choice, variability in the cutoff levels for positive results leads to inconsistencies in diagnosis and complicates the establishment of a clinical standard [6,7]. Additionally, FIT is not perfectly sensitive, especially in detecting adenomas [8]. Despite the availability of CRC screening methods, participation in public screening programs is low because of the psychological or physical discomfort associated with the tests, and their accuracy still needs to improve. Colonoscopy is considered the gold standard, serving both as a primary screening tool and as a follow-up procedure for individuals who have tested positive through other screening methods [9,10]. While highly sensitive, it is time- and resource-demanding and may cause complications such as bowel perforation, bleeding, dehydration due to bowel preparation, and cardiovascular events due to sedation [11,12].
Newer screening techniques include multitarget stool DNA testing (FIT-DNA), which combines FIT with the analysis of altered DNA biomarkers in stool cells. This approach has a significantly higher cancer detection rate compared to FIT alone but falls short in terms of specificity, potentially leading to an increased number of unnecessary colonoscopies [13]. Serological tests designed to identify circulating methylated SEPT9 DNA present comparatively lower sensitivity, with a standardized sensitivity of 48.2% [14].
These limitations underscore the need for an alternative non-invasive, cost-effective, low-risk, and highly sensitive screening test to prevent overdiagnosis. One promising method involves the analysis of volatile organic compounds (VOCs) in exhaled breath, which has shown encouraging results [15–19]. The concept is based on the premise that VOCs in human breath are indicators of metabolic processes and diseases. These compounds are identifiable in various biological materials, including tissues, urine, blood, and, notably, exhaled breath, which suggests that cancer cells release VOCs into the bloodstream. Subsequently, these compounds are excreted through the lungs and can be detected in exhaled air [20–22].
Investigating cancer-specific VOCs in biological fluids is a promising research direction. Previous studies have identified cancer-related VOCs in patients with various types of cancer, including stomach, breast, lung, and CRC [23–27]. Techniques such as gas chromatography–mass spectrometry (GC-MS), ion mobility spectrometry (IMS), proton transfer reaction mass spectrometry (PTR-MS), and electronic nose (e-nose) have been employed for analysis [28]. Numerous studies exploring the viability of VOC analysis as a CRC screening tool have yielded promising outcomes [15,29].
The main goal of this study was to investigate and characterize the VOC chemical patterns associated with the breath of CRC patients and to identify potential expiratory markers of this disease.

Results
The study included 78 patients diagnosed with CRC and 74 individuals serving as controls; the median age of the study subjects was 63 years. There was no statistically significant difference in the prevalence of CRC between genders (p = 0.836).
The CRC cohort exhibited diverse clinical stages and varying degrees of differentiation of colorectal adenocarcinoma. The clinical features of the participants are summarized in Table 1. Altogether, the analysis of breath samples from 152 subjects revealed the presence of 1656 distinct compounds. Among these, 1210 were identified in samples obtained from CRC patients, and 1267 were found in samples provided by control subjects.
The VOCs with an incidence rate exceeding 50%, categorized by chemical class, are outlined in Table 2. Within both groups, aldehydes, esters, and ketones emerged as the predominant classes of VOCs, followed by hydrocarbons, alcohols, aromatics, and heterocycles. The number of identified compounds per sample varied between subjects, ranging from 50 to 93, which underscores the diverse composition of VOCs among individuals. The distribution of VOCs based on their chemical classes was similar in the two groups under study. Table 3 lists the compounds with an occurrence exceeding 30%. In both patients and controls, the distribution of VOCs was comparable, with aromatics being the dominant class, comprising 12 compounds.
Out of the 1656 compounds detected, we selected 21 statistically significant VOCs that were observed at higher concentrations in the exhaled breath of the CRC group compared to the controls. These compounds are listed in Table 3. The VOCs are ordered according to increasing p-values from the Wilcoxon rank-sum test, and only those compounds for which the difference between groups was statistically significant are included. As indicated by the data, the levels of all these compounds were higher in the cancer group. Additional identified VOCs can be found in the Supplementary Information, Table S1. The violin plot (Figure 1) illustrates the distribution of chemical spike areas across two distinct groups: a control group and a cancer group.
Each compound may uniquely contribute to the overall breath fingerprint, reflecting specific metabolic and biochemical changes in CRC, such as changes in lipid metabolism and increased oxidative stress. For example, higher levels of aldehydes, such as hexanal, might result from cell membrane fatty acid peroxidation due to reactive oxygen species (ROS) [30,31]. Ketones, such as cyclohexanone and 2-butanone, are formed because of increased fatty acid oxidation [32]. Aromatic compounds, such as toluene and benzene, are commonly associated with the breakdown of cellular components and could reflect the increased cell turnover in cancer [33].
However, it is important to note that there is no consensus yet on the most prevalent VOCs in CRC patients' breath, and research is expanding to other biological materials like urine, blood, feces, and cancer tissues [20,33–35]. Studies in these areas, such as Wen Qing et al.'s research on urinary VOCs in cancer, have identified different predominant VOCs, suggesting that the VOC profile might vary with the biological material examined [33]. In contrast, our previous study, which focused on comparing VOCs released in cancerous versus non-cancerous tissues, revealed a predominance of hydrocarbons and alcohols, with aldehydes, ketones, and aromatic compounds following in prevalence [20].

Altomare et al.'s research, focusing on breath VOCs with high discriminant power for CRC, identified key VOCs (tetradecane, ethylbenzene, and benzaldehyde) that overlap with those found in our study, lending further credibility to these compounds as potential biomarkers [17]. Additionally, Śmiełowska et al.'s study on both breath and fecal samples from CRC patients demonstrates not only the diversity in VOCs but also the differences in their concentrations across different sample types, indicating that the VOC profile might be more pronounced in breath samples [36].
Wang Changsong et al. [37] identified cyclohexanone as a notable VOC in CRC patients' breath, which is consistent with our findings. In our study, along with cyclohexanone, we also detected tetradecane, dodecanal, and 2-butanone. These biomarkers were previously identified in our research on VOC emissions from cancerous versus normal colon tissues, though in that study these four compounds were found at reduced concentrations in cancerous tissue compared to healthy tissue [20]. This discrepancy calls for further investigation, taking into account factors such as the mixing of room air VOCs with exhaled air, the solubility of VOCs in blood, and other variables [38].
The detection of VOCs like hexanal, linked to lung cancer, and 2-butanone, associated with lung and breast cancer, highlights the broader implications of VOCs in cancer detection [39].This underscores the importance of understanding the complex nature of VOC profiles, which vary significantly in health and disease [40,41].
Our research adds to the existing knowledge base by introducing new VOCs and corroborating the findings of similar VOCs in CRC from other studies. However, the challenge remains in distinguishing cancer-specific VOCs or groups due to their diverse chemical nature and varying presence. Factors like gut microbiota, diet, and other health conditions can influence VOC profiles, complicating their interpretation [42].
Several limitations of this study should be mentioned. Firstly, the relatively small sample size of 152 individuals, coupled with the recruitment of control participants from one clinic and CRC patients from a single hospital in the same city, while sufficient for initial analysis, raises concerns about the geographical and genetic diversity of breath profiles and the broader applicability of our findings. Secondly, the stage of CRC was not considered in the analysis of VOCs. The absence of detailed data on the proportion of patients with late-stage versus early-stage adenocarcinoma limits our ability to draw conclusions about the disease stages. Lastly, the way samples were stored and processed could have affected the chemical patterns that were observed. Despite these limitations, the exploration of CRC patients' breath fingerprint through GC-MS analysis is an innovative approach with significant potential in cancer diagnostics, and the promising results and non-invasive nature of this method make it an exciting area for future research and development in oncology.

Chemical Standards and Quality Benchmarks
All reference mixtures were generated using high-purity liquid chemicals with stated purities ranging from 95% to 99.9%, sourced from Merck (Wien, Austria). The preparation of standards involved a two-step process. Initially, a few microliters of a liquid compound were introduced into evacuated and heated 1 L glass bulbs (Supelco, Toronto, Canada) to produce primary standards. Once the compounds had evaporated, the bulb pressure was equalized using nitrogen. Subsequently, the primary standards were diluted by transferring precise volumes from the bulb mixtures into 3–25 L Tedlar bags (SKC Inc., Eighty Four, PA, USA). These bags had been prefilled with purified and humidified air (with a relative humidity of 100% at 34 °C). The standards were sampled within a 30 min time window after production.
We acquired stainless steel industry-standard thermal desorption tubes (¼ inch outer diameter, 3½ inches long) from Markes International (Bridgend, UK). These tubes were prefilled with Tenax TA (Bridgend, UK) and coated with SilcoNert™ (Bellefonte, PA, USA). Before each sampling event, the sorbent tubes underwent reconditioning procedures following the manufacturer's guidelines.
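The two-step standard preparation described above can be illustrated with a short calculation. This is only a sketch under stated assumptions: ideal-gas behaviour, complete evaporation of the liquid, a mole fraction evaluated at 25 °C and 1 atm after equalization with nitrogen, and approximate property values for hexanal (density about 0.815 g/mL, molar mass about 100.16 g/mol). The function names and the example volumes are hypothetical and are not taken from the study.

```python
R = 8.314462618  # universal gas constant, J mol^-1 K^-1


def primary_standard_ppm(liquid_ul, density_g_per_ml, molar_mass_g_per_mol,
                         bulb_volume_l=1.0, temperature_k=298.15,
                         pressure_pa=101_325.0):
    """Mole fraction (ppm) of a fully evaporated liquid in the glass bulb.

    Assumes ideal-gas behaviour and that the bulb ends up at the given
    temperature and pressure after equalization with nitrogen.
    """
    moles_voc = liquid_ul * 1e-3 * density_g_per_ml / molar_mass_g_per_mol
    moles_gas = pressure_pa * bulb_volume_l * 1e-3 / (R * temperature_k)
    return moles_voc / moles_gas * 1e6


def diluted_standard_ppb(primary_ppm, transfer_ml, bag_volume_l):
    """Mole fraction (ppb) after adding an aliquot to a prefilled bag."""
    transfer_l = transfer_ml * 1e-3
    dilution = transfer_l / (bag_volume_l + transfer_l)
    return primary_ppm * 1e3 * dilution


# Hypothetical example: 1 uL of hexanal in a 1 L bulb, then 1 mL of the
# bulb mixture transferred into a 10 L bag (illustrative values only).
hexanal_primary = primary_standard_ppm(1.0, 0.815, 100.16)
hexanal_diluted = diluted_standard_ppb(hexanal_primary, 1.0, 10.0)
print(f"primary: {hexanal_primary:.0f} ppm, diluted: {hexanal_diluted:.1f} ppb")
```

Under these assumptions, a microliter-scale injection gives a primary standard in the hundreds of ppm, and a milliliter-scale transfer into a bag brings it down to the ppb range.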

Study Group Description and Recruitment Process
The study enrolled individuals diagnosed with CRC based on confirmed morphological assessments, including patients with confirmed adenocarcinoma, who donated a breath sample before undergoing surgical treatment. The control cohort consisted of individuals without high-risk precancerous lesions or colorectal adenocarcinoma, all of whom had undergone a colonoscopy. The definitive categorization for the study was determined following the examination of the morphological report.
Participants were sourced from two medical facilities: the Riga East Clinical University Hospital within the Oncology Center of Latvia and the Digestive Diseases Centre GASTRO in Riga, Latvia.
The enrollment criteria included individuals aged 18 and above who provided signed consent forms. Exclusion criteria were established to minimize potential confounding factors from other medical conditions. Individuals with concurrent active malignancies, a history of complete bowel cleansing, inflammatory bowel diseases, previous bowel resection, ongoing neoadjuvant chemotherapy and/or radiation therapy, acute conditions requiring emergency surgery, chronic renal failure stage 4, type I diabetes, and active bronchial asthma were excluded.

Breath Sample Collection
Breath samples were taken in a designated room that was free from any chemicals, cleaning agents, medications, solvents, or kitchen waste. Samples were taken at room temperature.
To reduce the impact of possible variables that could interfere with the accuracy of exhaled breath analysis, participants were provided with clear guidelines. They were instructed to adhere to certain practices, including fasting overnight, refraining from smoking and alcohol consumption, avoiding gum chewing, and abstaining from physical activity for at least two hours before providing breath samples. Furthermore, participants were advised not to use perfume until after the collection of their breath samples. These measures were implemented to ensure the reliability and integrity of the collected breath samples for analysis.
Breath samples were collected using a custom-designed breath sampler illustrated in Figure 2. This sampler comprised a single-use mouthpiece (Intersurgical) attached to a disposable elbow (Intersurgical) and the CO2 sensor cell (Masimo, Irvine, CA, USA; IRMA, Seattle, WA, USA) connected to the opposite end of the elbow. The elbow featured a ¼" port, facilitating the attachment of industry-standard ¼" sorbent tubes. Directly before sampling, the sampling end of a sorbent tube was inserted into the elbow so that it protruded 5–6 mm into its interior and secured with a ¼" PTFE nut. The other end of the sampling tube was connected to a 250 mL glass syringe (Socorex, Switzerland) using a 1/8" Teflon (Wilmington, DE, USA) tube. Participants could freely inhale and exhale through the mouthpiece without encountering pneumatic resistance.
Samples were taken manually by drawing a volume of 10-15 mL during the end-tidal phase of an exhalation, as determined by CO2 measurements. Ultimately, a total of 500 mL of breath was collected from a single subject over 20-30 subsequent exhalations. Immediately after sampling, both ends of the sorbent tube were sealed with brass ¼" nuts, and the tubes were frozen at −80 °C. Samples were stored at −80 °C and transported on dry ice, with efforts made to limit the storage time to 4 weeks.
Relative standard deviations (RSDs) were calculated using 5 consecutively analyzed breath samples obtained from healthy volunteers. RSDs varied from 4 to 22%, which is considered adequate for the purposes of this study.
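The RSD calculation mentioned above can be sketched as follows. The text does not state whether the sample or the population standard deviation was used, so the sample standard deviation (n − 1 denominator) is an assumption here, and the peak areas are hypothetical values chosen only for illustration.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD in percent: sample standard deviation / mean x 100."""
    if len(values) < 2:
        raise ValueError("at least two replicate measurements are needed")
    average = mean(values)
    if average == 0:
        raise ValueError("RSD is undefined for a zero mean")
    return stdev(values) / abs(average) * 100.0


# Hypothetical peak areas of one compound in 5 consecutively analyzed samples
replicate_areas = [1.00e6, 1.08e6, 0.95e6, 1.12e6, 0.98e6]
rsd = relative_standard_deviation(replicate_areas)
print(f"RSD = {rsd:.1f}%")
```

Applying the same function to each compound gives one RSD per compound, which can then be compared against the 4 to 22% range reported in the text.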

Gas Chromatography-Mass Spectrometry Examination of Breath Samples
A two-stage thermal desorption was performed using a thermal desorber and autosampler (TD100, Markes International Limited, Cardiff, UK). First, Tenax tubes were heated to 280 °C for 6 min under a constant flow of helium 6.0 (99.9999%) at 20 mL/min to desorb volatiles, which were then refocused in a cold trap packed with graphitized carbon black and maintained at 5 °C. The final injection of VOCs into the capillary column was achieved via the rapid heating of the cold trap to 320 °C for 1.5 min in splitless mode.
The VOC separation and analysis were performed using an Agilent 7890A/5975C GC-MS system (Agilent, Santa Clara, CA, USA). Volatiles were separated using an Rxi-624Sil MS column (30 m × 0.32 mm, film thickness 1.8 µm, Restek, Centre County, PA, USA) operated at a constant helium flow of 1.5 mL min⁻¹. The GC oven temperature program was as follows: 40 °C for 10 min, followed by 5 °C min⁻¹ up to 150 °C, hold for 5 min, then 10 °C min⁻¹ up to 280 °C, and isothermal at 280 °C for 5 min. The untargeted VOC analysis was performed with the mass spectrometer working in SCAN mode over an m/z range from 20 to 250. The peak integration was based on extracted-ion (m/z) chromatograms, an approach that allowed the majority of peaks of interest to be separated from their neighbors. The quadrupole, ion source, and transfer line were kept at 150 °C, 230 °C, and 280 °C, respectively.
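As a quick check of the oven temperature program described above, the total run time can be computed from the hold times and ramp rates. This is a minimal sketch; the function name and the tuple layout are illustrative choices and are not part of the instrument software.

```python
def oven_program_minutes(start_temp_c, initial_hold_min, ramps):
    """Total duration of a GC oven program in minutes.

    ramps is a list of (rate in degC/min, target in degC, hold in min).
    """
    total = initial_hold_min
    temperature = start_temp_c
    for rate, target, hold in ramps:
        total += (target - temperature) / rate + hold  # ramp time + hold
        temperature = target
    return total


# Program from the text: 40 degC for 10 min, 5 degC/min to 150 degC (hold 5 min),
# then 10 degC/min to 280 degC (hold 5 min).
program = [(5.0, 150.0, 5.0), (10.0, 280.0, 5.0)]
run_time = oven_program_minutes(40.0, 10.0, program)
print(f"total run time: {run_time:.0f} min")
```

The stages sum to 10 + 22 + 5 + 13 + 5 = 55 min per chromatographic run, excluding thermal desorption and oven cool-down.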

Statistical Data Analysis
Because the VOC levels deviated from a normal distribution, a non-parametric Wilcoxon rank-sum test was employed to assess and compare the measured VOC levels.
For this purpose, the breath gradient of the VOCs was used, i.e., the difference between the VOC level in breath and in room air. A comparison was drawn between individuals with CRC and those without cancer, with significance set at a threshold of p < 0.05. Moreover, only VOCs with an occurrence higher than 20% were taken into consideration. In this study, an untargeted analysis was performed to pinpoint volatile markers of CRC. Thus, the statistical analysis relied on relative quantification, with the peak areas of the detected metabolites used as the input parameter. For the purposes of this study, the limit of detection (LOD) was defined as three times the noise amplitude, and only peaks with a signal-to-noise ratio larger than 9 (3 × LOD) were taken into account.
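The filtering and testing steps described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' actual pipeline: the rank-sum test uses the normal approximation with average ranks for ties and no continuity correction, non-detected samples (marked `None`) are simply left out of the test, and all numbers and function names are hypothetical.

```python
import math


def passes_snr(signal, noise_amplitude, min_snr=9.0):
    """Keep a peak only if its signal-to-noise ratio exceeds 9 (3 x LOD)."""
    return signal / noise_amplitude > min_snr


def breath_gradient(breath_area, room_area):
    """Difference between the peak area in breath and in room air."""
    return breath_area - room_area


def occurrence(values):
    """Fraction of samples in which the compound was detected (not None)."""
    return sum(v is not None for v in values) / len(values)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation): (z, p)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])
    expected = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - expected) / sd
    return z, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical breath gradients (arbitrary area units) for one compound;
# None marks samples in which the compound was not detected.
crc = [5.2, 6.1, 4.8, 7.3, None, 6.6, 5.9, 7.0]
controls = [3.1, 2.8, None, 3.9, 4.0, 2.5, 3.3, 3.6]

if occurrence(crc + controls) > 0.20:
    crc_detected = [v for v in crc if v is not None]
    controls_detected = [v for v in controls if v is not None]
    z, p = rank_sum_test(crc_detected, controls_detected)
    print(f"z = {z:.2f}, significant at p < 0.05: {p < 0.05}")
```

In a real analysis, the same loop would run over every compound that passes the occurrence filter, and with hundreds of compounds tested, a multiple-comparison correction would be worth considering; the text does not state whether one was applied.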

Conclusions
The results of this study provide evidence that VOCs released in exhaled breath can serve as potential biomarkers for the presence of CRC. The identification of specific VOCs in the breath of CRC patients through GC-MS analysis offers valuable insights into the potential development of a breath-based diagnostic tool. Accurate identification of the VOCs linked to CRC is crucial for steering and refining the development of advanced sensor technologies. The distinct breath fingerprint associated with CRC holds promise for early detection and monitoring, presenting a non-invasive and patient-friendly approach to improving clinical outcomes. The findings underscore the significance of continued research in this field, as it is essential for translating these discoveries into robust and reliable diagnostic tools for CRC.

Figure 1. Comparison of Chemical Spike Areas Between Control and Cancer Groups: Violin Plot Analysis. Blue represents the control group, while red represents the cancer group. The Y-axis denotes the area values; dots indicate the median for each group, and the darker area indicates the 25th to 75th percentile. The figure shows the most common VOCs in the groups.


Figure 2. Scheme of the sampling system used in the study, consisting of a mouthpiece, a CO2 control sensor, a sorbent tube, and a syringe for drawing air.


Table 1. Gender and cancer stage and grade distribution in the study cohort.

Table 2. Volatile organic compounds categorized by chemical classes, with occurrence above 50%.

Table 3. The list of breath compounds exhibiting differences between CRC patients and controls.